Added cleanup code for gcp resources #1453

Merged

Conversation

AnkitCLI
Contributor

This PR adds after hooks to clean up GCP resources for the test cases that were missing them.
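For context, a cleanup after hook in this kind of CDAP e2e test suite generally follows the shape sketched below. This is only a minimal, hypothetical illustration of the pattern rather than the exact code added in this PR; the class name, the bqTargetTable field, the @BQ_SINK_TEST tag, and the SLF4J logging are all assumptions.

// Minimal sketch of a Cucumber after hook that cleans up a BigQuery table
// created during a scenario. All names here are illustrative assumptions.
import io.cdap.e2e.utils.BigQueryClient;
import io.cucumber.java.After;
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TestSetupHooksSketch {
  private static final Logger LOG = LoggerFactory.getLogger(TestSetupHooksSketch.class);

  // Assumed to be populated by the before hook that creates the table.
  public static String bqTargetTable;

  @After(order = 1, value = "@BQ_SINK_TEST")
  public static void deleteTargetBqTable() throws IOException, InterruptedException {
    // Drop the table so repeated e2e runs do not leak BigQuery resources.
    BigQueryClient.dropBqQuery(bqTargetTable);
    LOG.info("BQ target table {} deleted successfully", bqTargetTable);
  }
}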

@vikasrathee-cs added the build (Trigger unit test build) label Oct 16, 2024
@AnkitCLI changed the title from "Added cleanup code for gcs resources" to "Added cleanup code for gcp resources" Oct 16, 2024
Member

@itsankit-google left a comment


Why are there test failures with the error below in "Get count of no of records transferred to target BigQuery Table"?

com.google.cloud.bigquery.BigQueryException: Not found: Table cdf-athena:testing_bqmt.E2E_TARGET_cd33fe5f_1249_46b1_8fc3_b8da7dbce6f1 was not found in location US
	at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.translate(HttpBigQueryRpc.java:115)
	at com.google.cloud.bigquery.spi.v2.HttpBigQueryRpc.queryRpc(HttpBigQueryRpc.java:652)
	at com.google.cloud.bigquery.BigQueryImpl$35.call(BigQueryImpl.java:1282)
	at com.google.cloud.bigquery.BigQueryImpl$35.call(BigQueryImpl.java:1279)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
	at com.google.cloud.bigquery.BigQueryRetryHelper.run(BigQueryRetryHelper.java:64)
	at com.google.cloud.bigquery.BigQueryRetryHelper.runWithRetries(BigQueryRetryHelper.java:38)
	at com.google.cloud.bigquery.BigQueryImpl.queryRpc(BigQueryImpl.java:1278)
	at com.google.cloud.bigquery.BigQueryImpl.query(BigQueryImpl.java:1266)
	at io.cdap.e2e.utils.BigQueryClient.getSoleQueryResult(BigQueryClient.java:86)
	at io.cdap.e2e.utils.BigQueryClient.countBqQuery(BigQueryClient.java:46)
	at io.cdap.plugin.bigquery.stepsdesign.BigQueryBase.getCountOfNoOfRecordsTransferredToTargetBigQueryTable(BigQueryBase.java:90)
	at ✽.Get count of no of records transferred to target BigQuery Table(file:src/e2e-test/features/bigquery/sink/GCSToBigQuery.feature:39)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
POST https://www.googleapis.com/bigquery/v2/projects/cdf-athena/queries
{
  "code" : 404,
  "errors" : [ {
    "domain" : "global",
    "message" : "Not found: Table cdf-athena:testing_bqmt.E2E_TARGET_cd33fe5f_1249_46b1_8fc3_b8da7dbce6f1 was not found in location US",
    "reason" : "notFound"
  } ],
  "message" : "Not found: Table cdf-athena:testing_bqmt.E2E_TARGET_cd33fe5f_1249_46b1_8fc3_b8da7dbce6f1 was not found in location US",
  "status" : "NOT_FOUND"

@itsankit-google
Member

Why are there test failures in "Get count of no of records transferred to target BigQuery Table"?

Fixed by cdapio/cdap#15724.

@@ -1636,4 +1640,12 @@ public static void deleteTargetBqmtTable() throws IOException, InterruptedExcept
}
}
}

@After(order = 1, value = "@BQ_SECOND_RECORD_SOURCE_TEST")
Member


Should this be BQ_RECORD_SOURCE_TEST? I don't see this tag being used anywhere.

Contributor Author

@AnkitCLI Oct 22, 2024


We are using this in the bqtobq Additional.feature file, so BQ_SECOND_RECORD_SOURCE_TEST is used here.

Member


I see. Can we please rename both of these tags to something more meaningful?

It is hard to understand what BQ_RECORD_SOURCE_TEST and BQ_SECOND_RECORD_SOURCE_TEST refer to.

Contributor Author


Change done.

Member

@itsankit-google left a comment


@After(order = 1, value = "@BQ_SECONDARY_RECORD_SOURCE_TEST")
public static void deleteTempSource2BQTable() throws IOException, InterruptedException {
BigQueryClient.dropBqQuery(bqSourceTable2);
PluginPropertyUtils.removePluginProp("bqSourceTable2");
Member


Before removing the plugin property, we need to fetch it for logging, right?

Otherwise bqSourceTable2 would be empty.

Shouldn't it be something like:

bqSourceTable2 = PluginPropertyUtils.pluginProp("bqSourceTable2");
PluginPropertyUtils.removePluginProp("bqSourceTable2");

Contributor Author


Actually, the logger is in the right position. The way you have written it is also correct, but bqSourceTable2 is already assigned globally in the before hook, so there is no need to re-assign it in the after hook. Even after calling PluginPropertyUtils.removePluginProp("bqSourceTable2") we still have the globally assigned bqSourceTable2, so we can use the logger after the remove step.
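To illustrate the two approaches being discussed, here is a rough sketch, assuming it lives in the same hooks class as the earlier sketch (with io.cucumber.java.Before, java.util.UUID, and PluginPropertyUtils also imported). The before hook body and the addPluginProp call are assumptions about how bqSourceTable2 gets its value, not the exact code in this PR.

// Assumed static field shared between the hooks.
public static String bqSourceTable2;

// Assumed before hook: assigns the static field and records it as a plugin property.
@Before(order = 1, value = "@BQ_SECONDARY_RECORD_SOURCE_TEST")
public static void createTempSource2BQTable() throws IOException, InterruptedException {
  bqSourceTable2 = "E2E_SOURCE_" + UUID.randomUUID().toString().replaceAll("-", "_");
  PluginPropertyUtils.addPluginProp("bqSourceTable2", bqSourceTable2);
  // ... create the table and load the test data ...
}

@After(order = 1, value = "@BQ_SECONDARY_RECORD_SOURCE_TEST")
public static void deleteTempSource2BQTable() throws IOException, InterruptedException {
  BigQueryClient.dropBqQuery(bqSourceTable2);
  PluginPropertyUtils.removePluginProp("bqSourceTable2");
  // The static field still holds the table name here, so logging after the removal
  // works; the alternative is to re-fetch it with
  // PluginPropertyUtils.pluginProp("bqSourceTable2") before removing the property.
  LOG.info("BQ source table {} deleted successfully", bqSourceTable2);
}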

Member


The source of truth here would be the pluginProperties file, so I think we should always fetch the values from there instead of relying on the before hook.

Contributor Author


Added the step.

@itsankit-google dismissed their stale review October 23, 2024 04:45

Requested changes.

@AnkitCLI
Contributor Author

There is a similar comment in other tests as well, for example:

https://github.com/data-integrations/google-cloud/pull/1453/files#diff-d7999135c7a28d37bb9cad09ae4452055ce57b2f8fcfdaf123053690205b6cdeR1613-R1614

In this code we are not using the globally assigned target tables; we are just using the table names provided in the plugin parameters file, so it is not logging the target table names in this one.

@itsankit-google
Member

In this code we are not using the globally assigned target tables; we are just using the table names provided in the plugin parameters file, so it is not logging the target table names in this one.

Even though the plugin automatically creates the tables while writing, we do need to clean them up, and we should also correctly log which tables are getting deleted.
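A sketch of what that could look like, treating the plugin parameters file as the source of truth; the "bqTargetTable" property key, the hook name, the tag, and the log wording are assumptions rather than the code in this PR.

// Assumed cleanup hook for a plugin-created target table: fetch the name from
// the plugin properties, log it, then drop the table.
@After(order = 1, value = "@BQ_SINK_TEST")
public static void deletePluginCreatedTargetBqTable() throws IOException, InterruptedException {
  String targetTable = PluginPropertyUtils.pluginProp("bqTargetTable");
  LOG.info("Deleting BQ target table {}", targetTable);
  BigQueryClient.dropBqQuery(targetTable);
}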

@AnkitCLI
Contributor Author

Added the logger step and removed the step for cleanup.

@itsankit-google merged commit b242c36 into data-integrations:develop Oct 23, 2024
16 checks passed